Remarks/Arguments:

These amendments and remarks are in response to the Office Action dated April 17, 2007.

Claims 25, 29, and 31 are amended to clarify that the claimed fragments consist of "15 
[20 in claim 31] or more contiguous amino acids" in lieu of "at least 15 [20] amino acids." No 
new matter is added by these amendments. 

Claims 25, 29, 31, 35, 40, 41, 43, 50, 51, and 57-59 stand rejected under 35 USC 
Section 112 as lacking an adequate written description and lacking enablement in the 
specification. Applicants traverse these rejections for the reasons discussed below. 

Written description 

The Office Action appropriately quotes Regents of the University of California v. Lilly,
119 F.3d 1559, 1567 (Fed. Cir. 1997), for setting out the standards for a written description of 
a polynucleotide or polypeptide sequence, that is, "a precise definition, such as by structure, 
formula, [or] chemical name." In addition, the MPEP explains that "[f]or some biomolecules, 
examples of identifying characteristics include a sequence, structure, binding affinity, binding
specificity, molecular weight, and length," (emphasis added), MPEP Section 2163(2)(A)(3). 
"Disclosure of any combination of such identifying characteristics that distinguish the claimed 
invention from other materials and would lead one of skill in the art to the conclusion that the 
applicant was in possession of the claimed species is sufficient." MPEP Section 
2163(2)(A)(3)(i). The Office Action also correctly notes, on pages 3-4, that polynucleotides and
polypeptides cannot be described by function alone, but require some form of sequence or
structural description, and states, on page 4, that "'description may be achieved by means of a
recitation of a representative number of cDNAs, defined by nucleotide sequence...'" (also 
quoting Lilly). 

Applicants respectfully point out that the listing of SEQ ID NO: 2 in the specification 
suffices to meet the written description requirement for "at least 15 contiguous amino acids of 
SEQ ID NO:2." The number of possible fragments of 15 or more contiguous amino acids within 
SEQ ID NO:2 is finite, and all such fragments are described by sequence in the listing of SEQ ID 
NO:2. To further clarify this issue, Applicants have amended claims 25 and 29 to recite "15 or 
more contiguous amino acids" in lieu of "at least 15 contiguous amino acids." Claim 31 is
similarly amended to recite "20 or more contiguous amino acids." 
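
The finiteness of the claimed fragment set can be illustrated with a short sketch. It assumes only that a fragment is a contiguous stretch of a fixed parent sequence; the function names and the 20-residue example sequence are hypothetical stand-ins, since the length and content of SEQ ID NO:2 are not reproduced in these remarks. For a parent sequence of N residues and a minimum fragment length k, the number of positional fragments is (N - k + 1)(N - k + 2)/2.

```python
def count_contiguous_fragments(sequence_length: int, min_length: int) -> int:
    """Count contiguous fragments (by position) of at least min_length residues."""
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    n = sequence_length - min_length + 1  # number of possible start positions
    if n <= 0:
        return 0
    return n * (n + 1) // 2


def enumerate_fragments(sequence: str, min_length: int):
    """Yield every contiguous fragment of at least min_length residues."""
    for start in range(len(sequence) - min_length + 1):
        for end in range(start + min_length, len(sequence) + 1):
            yield sequence[start:end]


# Hypothetical 20-residue sequence, used only to exercise the functions.
example = "ACDEFGHIKLMNPQRSTVWY"
fragments = list(enumerate_fragments(example, 15))
print(len(fragments), count_contiguous_fragments(len(example), 15))
```

Every fragment produced this way is an exact subsequence of the parent sequence, which is the sense in which the sequence listing describes each one.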

However, the Office Action states on page 3 that the claimed immunogenic fragments 
are considered as "variants." Applicants urge reconsideration of this point in view of definitions 
found in the specification. The claimed fragments consist of a sequence of 15 or more amino 
acids exactly as found within SEQ ID NO:2. There can be no variation from SEQ ID NO:2 within 
these fragments, and this is why Applicants urge that they are correctly described by the listing 
of SEQ ID NO:2 (see page 4, paragraph 0016 and page 9, paragraph 0029 of specification for 
definitions). "Variants," in contrast, contain some alteration of the SEQ ID NO:2 sequence, 
such as a deletion, addition, or substitution of an amino acid (see page 77 of the specification,
paragraph 0277, for definition). Therefore, the written description requirement has been met 
for the claimed immunogenic fragments by the description of SEQ ID NO:2. 

The claimed fusion proteins are described in the specification on page 10, paragraph 
0034 through page 11, paragraph 0038. These paragraphs specify that the fragments of claims 
25, 29, or 31 are fused to a fusion partner such as those listed in paragraphs 0035 - 0038. 
Therefore, the written description requirement has been met for the claimed fusion proteins 
through the listing of SEQ ID NO:2 and the description of fusion partners.

The claimed immunogenic compositions are described in the specification on page 60, 
paragraph 0250 through page 71, paragraph 260. Vaccine compositions are described on page 
64, paragraphs 0226-0227 as comprising an immunogenic recombinant polypeptide (described 
by SEQ ID NO: 2) and a suitable carrier such as a pharmaceutically acceptable carrier. An
adjuvant may also be present in the composition. Accordingly, the written description
requirement has been met for the claimed immunogenic compositions. 

For these reasons, Applicants respectfully request that the Section 112 rejections of 
claims 25, 29, 31, 35, 40, 41, 43, 50, 51, and 57-59 as not adequately described be withdrawn. 

Enablement 

With respect to enablement, the Office Action asserts that the specification does not 
enable the claims to recombinant polypeptides, fusion proteins, immunogenic compositions, and 
method of inducing an immune response insofar as they are directed to immunogenic 
fragments. 

As mentioned above, it is important to distinguish "fragments" from "variants." The 
exact amino acid sequence of the claimed fragments is known and cannot be substituted or 
altered as with a variant. Thus, there is no ambiguity as to the exact sequence of any of the
claimed fragments, and they are not "undefined polypeptides," as stated on page 6 of the Office 
action. 

The Office Action expresses concern that the 3-D structure of the fragments must be 
known to predict whether they are available as epitopes for antibody binding, and points out 
that Greenspan and Di Cera, Nat. Biotechnol. 17: 936, 1999, teach that "defining epitopes is 
not as easy as it seems," (OA, page 8). However, the Greenspan article is focused on the 
limitations of one particular mutation-based method, alanine scanning mutagenesis (ASM), in 
predicting epitope regions of proteins. ASM is not required for and was not used in identifying 
the claimed polypeptides and fragments and is not mentioned in the specification. Therefore, 
the Greenspan reference is not relevant to the instant claims. 

Furthermore, advances in software for protein modelling have produced programs which
can accurately predict 3-D structure and epitope regions of proteins and polypeptides. For
example, Sali, et al., "Three dimensional models for four mouse mast cell chymases," J. Biol.
Chem. 268: 9023-9034, 1993, demonstrates one method of 3-D protein modeling and 
prediction of immunogenic epitopes and surface regions of the modeled proteins (courtesy copy 
enclosed). Thus, one of ordinary skill in the art would be able to predict 3-D structure and 
antigenic sites with reasonable accuracy using such methods. However, this is unnecessary for 
the claimed invention, because immunogenic/epitope regions of the polypeptides can be readily 
identified by routine screening techniques. 

As stated in In re Wands, "[t]he nature of monoclonal antibody technology is that it 
involves screening...," and "methods for obtaining and screening monoclonal antibodies were 
well known in 1980." 858 F.2d 731, 740, 737 (Fed. Cir. 1988). Furthermore, "a considerable
amount of experimentation is permissible, if it is merely routine...". Id. at 737. Following the 
"Wands" factors, the court then determined, in In re Wands, that screening was necessary to 
produce an antibody to a particular antigen and was routine in the art of monoclonal antibody 
production. Id. at 740. Screening is routine in the production of all antibodies, not just 
monoclonal antibodies, because it is important to identify antibodies with the greatest binding 
affinity. Similarly, screening is routine in the art of vaccine production to identify antigens, 
epitopes, and antibodies with the greatest vaccine potential. 
In paragraph 0016 (page 5), the specification describes an immunogenic fragment as
having substantially the same immunogenic activity as the polypeptide comprising SEQ ID 
NO:2. From this description, one of skill in the art would understand that antibodies can be 
raised to each possible fragment of 15 or more contiguous amino acids of SEQ ID NO:2 and the 
binding activity of these antibodies to target molecules compared to the binding activity of 
antibodies raised against the polypeptide of SEQ ID NO: 2. In this way, the specification 
enables the production and use of the claimed immunogenic fragments. Teachings for making
and using antibodies from immunogenic fragments are found in the specification in paragraphs
0186-0188 (pages 51-52). Methods for screening and using agonist and antagonist molecules,
which include antibodies, are provided in paragraphs 0196-0199 (pages 53-55) and 0204-0206
(page 57).

Applicants submit that routine screening employed in the general course of antibody and
vaccine production may readily be used to determine whether a particular peptide fragment (or 
fusion protein containing a peptide fragment) will raise antibodies, i.e., is immunogenic or 
raises an immune response, and whether these antibodies are capable of binding to a 
polypeptide having the sequence of SEQ ID NO:2. The methods for producing and screening 
antibodies and immune compositions are well-established and well-known to those of ordinary 
skill in the art. Therefore, the experimentation required to identify immunogenic fragments 
that can bind to a polypeptide of SEQ ID NO:2 and be used in immunogenic compositions is 
merely routine, and not undue. 

Because routine, rather than undue, experimentation can be employed to identify the 
claimed immunogenic fragments, claims 25, 29, 31, 35, 40, 41, 43, 50-51, and 57-59 are 
enabled. Applicants respectfully request that the Section 112 rejections of these claims as not 
enabled be withdrawn. 
Conclusion



It is respectfully submitted that the claims are in condition for immediate allowance and 
a notice to this effect is solicited. The Examiner is invited to phone applicants' attorney if it is 
believed that a telephonic interview would expedite prosecution of the application. 
Mouse mast cell protease (mMCP) 1, mMCP-2,
mMCP-4, and mMCP-5 are serine proteases which are 
predicted to have chymotryptic specificity (chymases). 
They are bound to negatively charged heparin or chon- 
droitin sulfate proteoglycans and are stored in secre- 
tory granules. Three-dimensional (3D) models of these 
four proteases were constructed with a comparative 
molecular modeling technique based on satisfaction of 
spatial constraints. The models were used to predict 
immunogenic epitopes and surface regions that are 
likely to interact with proteoglycans. Nine potential 
antigenic segments in the four chymases were identi- 
fied on the basis of solvent accessibility, protrusion, 
flexibility, and sequence variability. These segments 
are suitable epitopes for preparation of protease-specific
antipeptide immunoglobulin. Two regions with
net charges ranging from +6 to +10 at neutral pH were 
found on the surfaces of mMCP-4 and mMCP-5. The 
two regions are located far from the substrate binding 
cleft at diametrically opposite ends of the folded pro- 
teases. A strong positive electrostatic potential sur- 
rounds the two regions. Thus, they are good candidates 
for binding sites that interact with heparin proteogly- 
can in the granules of serosal mast cells. In contrast, 
mMCP-1 and mMCP-2, which are present in granules 
of mucosal mast cells that contain chondroitin sulfate, 
lack one of these regions and have a lower charge 
density in the other. The differences between the 3D 
models provide a structural basis for the selective lo- 
calization of specific chymases within mouse mast cells 
that contain different proteoglycans. 



Mast cells contain many serine proteases in their secretory
granules. These effector cells of the immune response are the
major source of neutral proteases in connective tissues. The
physiologic functions of mast cell granule proteases have not
been determined, although they have been implicated in the
metabolism of cytokines and hormones (1-6), extracellular
matrix proteins (7-11), metalloproteases (12), and plasma
proteins (13-15).

* This work was supported in part by Grants AI-23483, AI-31599,
HL-36110, GM-30804, and RR-05950 from the National Institutes of
Health, by a grant from the Hyde and Watson Foundation and by a
grant from The Jane Coffin Childs Memorial Fund for Medical
Research. The costs of publication of this article were defrayed in
part by the payment of page charges. This article must therefore be
hereby marked "advertisement" in accordance with 18 U.S.C. Section
1734 solely to indicate this fact.

§ Fellow of The Jane Coffin Childs Memorial Fund for Medical
Research.

|| Recipient of a Heald Fellowship from the Arthritis Foundation
of Australia.

** To whom correspondence should be addressed: Dept. of Chemistry,
Harvard University, 12 Oxford St, Cambridge, MA 02138. Tel.:
617-496-4018; Fax: 617-496-3204 (or R. Stevens at Harvard Medical
School, Seeley G. Mudd Bldg., Rm. 617, 250 Longwood Ave., Boston,
MA 02115. Tel.: 617-432-1512; Fax: 617-432-0979.)

At least seven different 26-32-kDa mast cell serine proteases,
designated mouse mast cell protease (mMCP) 1¹ to
mMCP-7, have been identified in the granules of mouse mast 
cells (16-24). Based on the homology arguments and the 
amino acid sequences deduced from their cDNAs and genes, 
mMCP-6 and mMCP-7 (tryptases) are predicted to have 
substrate specificity for a positively charged residue at the 
amino-terminal side of the scissile bond, whereas mMCP-1 to
mMCP-5 (chymases) are predicted to have specificity for a 
large hydrophobic residue at the corresponding substrate po- 
sition. mMCP-1 (16) and mMCP-2 (20) are preferentially 
expressed in mucosal mast cells, a subclass of mast cells that 
increases in the intestines of helminth-infected BALB/c mice. 
In contrast, mMCP-4 (21), mMCP-5 (22), and mMCP-6 (18)
are preferentially expressed in the mouse mast cells that reside 
in the serosal cavity of BALB/c mice. mMCP-7 is not syn- 
thesized in serosal or mucosal mast cells, but it is transiently 
expressed in in vitro differentiated mast cells derived from 
bone marrow (19). Transformed (17) and nontransformed (25, 
26) mouse mast cells have been obtained in vitro that express 
mixed protease phenotypes, raising the possibility that the 
protease phenotypes of mouse mast cells might be more 
heterogeneous in vivo than recognized previously. 

Despite being derived from distinct genes, the mast cell 
chymases are similar to each other and possess identical 
regions of up to 15 amino acid residues in length. Thus, to 
identify immunohistochemically which chymases are ex- 
pressed in a particular tissue-localized mast cell, the antipro- 
tease immunoglobulins need to be specific to regions on the 
protein surface that differ among the proteases. By means of 
an antipeptide approach, rabbit anti-mMCP-5ne-i62 immu- 
noglobulin was generated against a synthetic peptide that 
corresponds to a variable region within this protease (27). 

Many of the 26-32-kDa mouse, rat, and human mast cell
serine proteases are exocytosed from the effector cell in fully
active form as >10^7 Da macromolecular complexes bound to
serglycin proteoglycans (3, 28-30). Serglycin proteoglycans,
which are highly acidic, consist of a serine/glycine-rich peptide
core (31, 32) whose serine residues are covalently linked
to different types of glycosaminoglycan chains (33-37). When
the positively charged serine proteases and carboxypeptidases
of mast cells are bound to proteoglycans, autolysis is minimized.
In at least one instance, interaction with proteoglycan
has been found to influence the substrate specificity of the
protease (3). It is possible that binding of a protease by a
proteoglycan increases the retention period of a protease in
the inflammation site and shields it against inactivation by
circulating protease inhibitors.

¹ The abbreviations used are: mMCP, mouse mast cell protease;
chymases, mMCPs with chymotryptic specificity; rMCP, rat mast
cell protease; 3D, three-dimensional; tryptases, mMCPs with tryptic
specificity.

Little is known about how mast cell granule proteases 
interact with serglycin proteoglycans. Because mast cell chy- 
mases are positively charged at neutral pH and because they 
do not have the Trp-Ser-X-Trp heparin-binding motif (38), 
it has been presumed that they are electrostatically bound to 
the negatively charged glycosaminoglycans of serglycin pro- 
teoglycans. Commercially prepared porcine heparin glycosa- 
minoglycan binds to numerous proteins that have a consensus 
amino acid sequence of either X-B-B-X-B-X or X-B-B-B-X-X-B-X,
where X and B are noncharged and basic amino acids,
respectively (39). These two patterns are a result of the 
periodicity of helices and strands, and of the requirement for 
basic residues to be exposed so that they can interact with a 
negative ligand (39). However, many other heparin-binding 
proteins exist that do not have either of these patterns (39). 
In fact, neither pattern is present in mMCP-4 (21), mMCP-5 
(22), or mast cell carboxypeptidase A (40), even though all
three proteases are stored in the secretory granules of mouse 
serosal mast cells (17) in complex with heparin proteoglycan 
(34). Although the predominant proteoglycan in mouse mu- 
cosal mast cells has not been identified, rat mucosal mast 
cells contain chondroitin sulfate di-B/E proteoglycan rather 
than heparin proteoglycan (41). Thus, it is likely that mMCP- 
1 and mMCP-2 are preferentially complexed to highly sulfated 
chondroitin proteoglycans. 
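
A search for the two consensus patterns described above can be sketched as follows. This is a minimal illustration, not the method used in the cited work: it assumes that B means Lys or Arg and that X means any residue other than Lys, Arg, Asp, or Glu (His is treated as noncharged), and the function name and test sequences are hypothetical.

```python
import re

# Assumed residue classes (not specified in this form by the text):
BASIC = "KR"      # B: basic residues
CHARGED = "KRDE"  # X: anything not in this set

B = f"[{BASIC}]"
X = f"[^{CHARGED}]"

PATTERNS = {
    "X-B-B-X-B-X": f"{X}{B}{B}{X}{B}{X}",
    "X-B-B-B-X-X-B-X": f"{X}{B}{B}{B}{X}{X}{B}{X}",
}


def find_motifs(sequence: str):
    """Return (pattern name, start index, matched segment) for every hit."""
    hits = []
    seq = sequence.upper()
    for name, pattern in PATTERNS.items():
        # A lookahead wrapper lets overlapping matches be reported too.
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((name, match.start(), match.group(1)))
    return hits


print(find_motifs("AKKGRA"))    # hypothetical peptide containing the 6-residue pattern
print(find_motifs("AKRKGGRA"))  # hypothetical peptide containing the 8-residue pattern
```

An empty result for a sequence says only that neither linear pattern occurs under these assumed residue classes; as the text notes, many heparin-binding proteins lack both patterns.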

To understand how mMCPs interact with proteoglycans 
and antibodies, it is important to know their tertiary struc- 
ture. Although none of the mouse chymases has been purified 
and therefore no experimentally determined structures of 
mouse chymases are available, comparative molecular mod- 
eling can be used to predict their structures. It has been shown 
that the three-dimensional (3D) structure of a protein can be 
calculated if its amino acid sequence is sufficiently similar to 
that of a protein with known tertiary structure (42-46). This 
modeling technique is particularly useful when only low to 
medium resolution results are required, such as prediction of 
exposed regions that may interact with antibodies (47) and 
models of interaction based on electrostatic complementarity 
(48). Comparative modeling based on the crystallographic 
structures of homologous proteases has been applied to obtain 
3D models of rat mast cell protease (rMCP) I (49), rat mast 
cell carboxypeptidase A (50), and mast cell tryptases (24). 
These 3D models were used to identify putative heparin 
binding sites by focusing on the regions that contain many 
positively charged amino acid residues. 

In this study, we have used a method of comparative mod- 
eling that is based on the satisfaction of spatial constraints 
(51) to predict the 3D structures of four mouse mast cell 
chymases. The modeling was based on the 3D structure of 
rMCP-II determined by x-ray crystallography (49) and on its 
sequence identity of 55-75% to the mouse chymases. Although 
the overall structure of the mouse chymases is shown to be 
similar to that of rMCP-II, a detailed examination of the 
models provided additional information about the interaction 
between the mMCPs and proteoglycans. Two regions containing
a large number of Lys and Arg residues were identified on
the faces of mMCP-4 and mMCP-5 away from the substrate-
binding cleft of each protease. These two regions are likely to
interact with the negatively charged heparin of the granular 
matrix, leaving the active site of the enzyme exposed for 
substrate hydrolysis. Variable regions were also identified on 
the surface of mMCP-1, mMCP-2, mMCP-4, and mMCP-5 
that would be suitable epitopes for preparation of protease- 
specific antipeptide immunoglobulins. 

MATERIALS AND METHODS 

Comparative Modeling of Mouse Mast Cell Chymases. The high-resolution
3D structures of nine serine proteases (Table I) have been
determined by x-ray crystallography and deposited in the Brookhaven 
Protein Data Bank (52); they form the data base that was used to 
predict the 3D structures of mMCP-1, mMCP-2, mMCP-4, and 
mMCP-5. Because of the size of the family, the number of insertions 
and deletions, and the poor degree of similarity between the nine 
known proteases, their structures were superimposed with the use of 
COMPARER (53, 54), a computer program that relies on a number 
of features of protein 3D structure to obtain the best match between 
their amino acid sequences. These features included residue type 
identity, residue type properties, side chain accessibility, main chain 
accessibility, hydrogen bonds, position of Cα atoms, φ dihedral angle,
ψ dihedral angle, and local main chain direction (53). The multiple
structural alignment of the nine known proteases and the sequences 
of mMCP-1 to mMCP-7 were then compared in the second step to
obtain the final alignment of the whole serine protease family. The 
second sequence comparison between mMCP-1, mMCP-2, mMCP-4, 
mMCP-5 and the structurally known rMCP-II was done by hand 
because of their 55-75% sequence identities. This sequence similarity 
allowed an unambiguous alignment between the mouse and rat chy- 
mases and thus also an unambiguous alignment between the mouse 
chymases and the other eight known structures. 

A 3D model is usually most accurate if only the proteins most 
similar to the sequence being modeled are used (55). To find which
of the nine serine proteases with known 3D structure are most similar 
to each mMCP, the table of percentage sequence identities for all 
pairs of the proteins was calculated from the multiple alignment. 
This matrix was used with the KITSCH computer program (56) to 
calculate a tree that expresses the relationships among the sequences 

Table I

Sources of structural and sequence data used in the comparative
modeling of the mouse mast cell chymases

The structures were obtained from the fall 1991 release of the
Brookhaven Protein Data Bank (52). When more than one molecule
is present in a file, the first molecule is used. The deduced amino acid
sequences of the mMCPs were obtained from the GenBank data base
(86), except for mMCP-1 and mMCP-7,

A. Serine proteases with 3D structures determined by x-ray crystallography

Protease                     Brookhaven code   Resolution (Å)   Ref.
Rat tonin                    1TON              1.8              87
Porcine kallikrein           2PKA              2.0              88
Bovine trypsin               2PTN              1.5              89
Bovine chymotrypsin          4CHA              1.7              90
Porcine elastase             3EST              1.6              91
Rat trypsin                  1TRM              2.3              92
Human neutrophil elastase    1HNE              1.8              93
rMCP-II                      3RP2              1.9              94
S. griseus trypsin           1SGT              1.7              95

B. Mouse mast cell proteases

Name       GenBank code               Ref.
mMCP-1                                23
mMCP-2     J05177                     20
mMCP-4     M55616, M55617, M57401     21
mMCP-5     M73759, M73760             22
mMCP-6     M57625                     18
mMCP-7     L00653, L00654             19

of serine proteases, similar to the trees used to deduce the evolution
of protein families. In this tree, the differences between two groups 
of sequences are approximated by a vertical distance from the top of 
the tree to the highest node from which the two groups of sequences 
branch off. 
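
The table of pairwise percentage sequence identities mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: identity is computed here over alignment columns in which neither sequence has a gap (the text does not state the denominator), and the function names and the toy alignment are hypothetical.

```python
def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percentage of identical residues over columns where both sequences are ungapped."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def identity_matrix(alignment: dict) -> dict:
    """Map every ordered pair of sequence names to its percentage identity."""
    names = list(alignment)
    return {(p, q): percent_identity(alignment[p], alignment[q])
            for p in names for q in names}


# Toy alignment with made-up sequences, used only to exercise the functions.
toy = {"seqA": "ACDEFG", "seqB": "ACDEYG", "seqC": "AC-EFG"}
for pair, value in identity_matrix(toy).items():
    print(pair, round(value, 1))
```

A matrix of this kind, or a distance derived from it such as 100 minus the identity, is the sort of input a distance-based tree program takes.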

Starting with the alignment of mMCPs with the selected proteins 
of known structure, a method of comparative modeling by satisfaction 
of spatial constraints (51) was used to predict the 3D structure of 
each mMCP without further manual intervention or subjective decisions.
This method is implemented in the MODELLER program.
The spatial constraints needed to calculate the structure of the 
unknown protein are determined and then these constraints are 
maximally satisfied to obtain the 3D model. The spatial constraints 
are derived by transferring the spatial features from the structures of 
known proteins to the sequence of the unknown protein. For example, 
if there is a conserved hydrogen bond at an equivalent position in the 
3D structures of all known serine proteases, it is assumed that each 
mMCP has this hydrogen bond in its structure. Such a hypothesis
represents a distance constraint on the 3D structure of the unknown 
protein. The more strong constraints are available, the more precise 
is the overall 3D model of the unknown protein. 

As an example of the constraints, we consider the Cα-Cα distances.
The Cα-Cα distance constraints between all pairs of Cα atoms are
derived in the following way. A survey of many pairs of aligned
protein structures shows that a conditional probability density function
for a Cα-Cα distance, given an equivalent Cα-Cα distance in a
related protein determined from the sequence alignment, is a Gaussian
function,

\[ p(d' \mid d) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{d' - d}{\sigma}\right)^{2}\right] \qquad (1) \]

where d' and d are the two distances and σ is the standard deviation of
the distribution. When d is known, this relationship can be used to
constrain d'. The most likely value of d' is equal to d. Since the
standard deviation is only about 1 Å, this constraint is strong. The
precise value of the standard deviation is calculated from a polynomial 
that depends on the similarity between the two proteins, on the 
solvent exposure of the 2 residues spanning the distance, and on the 
proximity of the nearest gaps in the alignment of the two proteins. 
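
The Gaussian distance constraint described above can be evaluated with a few lines of code. This is a minimal sketch, not the MODELLER implementation: it assumes a fixed standard deviation of 1 Å (the text derives the precise value from a polynomial that is not reproduced here), and the function names and the example distances are hypothetical.

```python
import math


def gaussian_pdf(d_model: float, d_template: float, sigma: float = 1.0) -> float:
    """Gaussian density of the modeled distance given the template distance (Å)."""
    z = (d_model - d_template) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def restraint_penalty(d_model: float, d_template: float, sigma: float = 1.0) -> float:
    """Negative log density; smaller values mean the constraint is better satisfied."""
    return -math.log(gaussian_pdf(d_model, d_template, sigma))


# Hypothetical template distance of 10 Å and three trial model distances.
for trial in (10.0, 11.0, 13.0):
    print(trial, gaussian_pdf(trial, 10.0), restraint_penalty(trial, 10.0))
```

The density peaks where the modeled distance equals the template distance, and with a standard deviation of about 1 Å it falls off quickly, which is why the text calls the constraint strong.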

The side chain dihedral angles were modeled using the rotamer 
library (57) that also takes into account side chain conformations 
and types of equivalent residues in related structures (51). This 
homologue-dependent rotamer library consists of entries for each 
existing combination of residue and dihedral angle types. The entries 
were derived from the alignments of related structures by tabulating 
relative frequencies of up to three possible side chain dihedral angle 
classes for each existing combination of equivalent residue and dihe- 
dral angle types in a related protein. For example, the probabilities 
of conformations t, and + for a Ser χ1 angle are 0.6, 0.2, and 0.2,
respectively, if there is a Cys residue in conformation — at an 
equivalent position in a related structure. Once these weights are 
determined, the probability density function for the side chain dihe- 
dral angle can be modeled by a weighted sum of three Gaussian 
distributions with means and standard deviations corresponding to 
the observed distribution of this dihedral angle in the known protein 
structures. 
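
The weighted sum of three Gaussian distributions described above can be sketched as follows. The weights 0.6, 0.2, and 0.2 come from the example in the text; the class means (-60°, 180°, +60°), the standard deviation of 15°, and the assignment of weights to classes are assumptions made only for illustration, as are the function names.

```python
import math


def angular_difference(a: float, b: float) -> float:
    """Signed difference a - b in degrees, wrapped into the range [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def dihedral_pdf(angle, weights, means, sigmas) -> float:
    """Weighted sum of Gaussian densities over the dihedral angle classes."""
    total = 0.0
    for w, mu, s in zip(weights, means, sigmas):
        z = angular_difference(angle, mu) / s
        total += w * math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))
    return total


weights = (0.6, 0.2, 0.2)      # from the Ser/Cys example in the text
means = (-60.0, 180.0, 60.0)   # assumed class centers, in degrees
sigmas = (15.0, 15.0, 15.0)    # assumed standard deviations, in degrees

for angle in (-60.0, 60.0, 180.0):
    print(angle, dihedral_pdf(angle, weights, means, sigmas))
```

Wrapping the angular difference keeps the density periodic, so that 180° and -180° are treated as the same conformation.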

Similar analyses of alignments of related structures gave the prob- 
ability density functions for several other spatial features, including 
main chain dihedral angles φ and ψ, and distances between all pairs
of main chain N and O atoms (51). The numbers of constraints for 
each of the 16 features used to model mMCP-5 are listed in Table II. 
The initial structure for the sequence to be modeled is a polypeptide 
chain, consisting of all non-hydrogen atoms, with random main chain 
and side chain dihedral angles. The final model is obtained by 
iteratively changing the structure to satisfy optimally all the individ- 
ual constraints combined in the molecular probability density func- 
tion. The optimization procedure employed in MODELLER is the 
variable target function method (58). This stage of comparative 
modeling is technically similar to the refinement of protein 3D 
structures from the distance and dihedral angle constraints obtained 
from multidimensional nuclear magnetic resonance spectroscopy. 
Five slightly different models were calculated for each mMCP by 
using different initial conformations; the root mean square difference 
for superposition of these models was generally less than 0.2 Å. The
structure with the highest value of the molecular probability density 
function was selected as the representative model; the deviations 



Table II
Constraints used to model mMCP-5

Type                                  Number(a)   Root mean square(b)
Bond lengths                          1,822       0.005 Å
Bond angles                           2,465       1.96°
Dihedral angles(c)                    994         3.14°
Van der Waals contacts(d)             559         0.02 Å
Cα-Cα distances                       11,294      0.05 Å
Main chain N-O distances              3,280       0.14 Å
Main chain φ dihedral angles          225         21.5°
Main chain ψ dihedral angles          225         21.3°
Side chain χ1 dihedral angles         181         9.[illegible]°
Side chain χ2 dihedral angles         135         9.6°
Side chain χ3 dihedral angles         54          14.5°
Side chain χ4 dihedral angles         29          12.2°
Disulfide bridge bonds                3           0.003 Å
Disulfide bridge angles               6           2.28°
Disulfide bridge dihedral angles      3           15.7°
cis-Peptides(e)                       1           4.0°

(a) Number of constraints of a given type that were used to model
mMCP-5.

(b) Deviation between the actual values and the most likely values.

(c) These dihedral angles constrain the planarity of peptide bonds
and rings as well as chirality of the chiral carbon atoms.

(d) All pairs of atoms that are not constrained by any of the bond
or bond angle terms are constrained by the minimal contact distance.
The number of pairs that violate this constraint in the final model is
listed.

(e) The 205-206 ( M4-225) peptide bond in rMCP-II is in the cis
conformation. Since the mMCP-5 sequence also has a Pro residue at an
equivalent position, it was constrained to a cis conformation.

from the most likely values of the constraints for the mMCP-5 model 
are given in Table II. 

Electrostatic Potential of Mouse Mast Cell Chymases— Electrostatic 
terms in the potential energy often give rise to specific interactions 
in complexes (e.g. that between a Lys and a sulfate at contact 
distance). However, for trying to understand or to predict the nature 
of a complex between two macromolecules it is often useful to look 
at the global electrostatic potential of the two ligands involved. If the 
structure of only one ligand is known it is particularly helpful to 
examine its electrostatic potential for possible binding sites of the 
other ligand. This is true in the present case where the interaction 
between a positive (the protein) and a negative (the glycosaminoglycan)
polyion is considered and the detailed structure of the latter is
not available. 

The electrostatic potential of each mMCP was calculated with 
UHBD 2.2 (59), a computer program that uses the finite difference 
method to solve the linearized Poisson-Boltzmann equation. Solvent 
and protein are treated as two dielectric continuums with different 
dielectric constants. Calculation of the electrostatic potential of a 
protein can be sensitive to details of the protein model, nature of the 
solvent, and the parameters in the electrostatic model. Therefore, the 
potential surfaces were calculated under a number of different con- 
ditions to determine which features of the potential were conserved 
and thus most likely to be correct. The ionic strength of the aqueous 
solvent was varied between 1 and 100 mM and temperature was set 
to 300 K. The relative dielectric constant used for the protein was in 
the range from 2 to 10. The grid size for the calculation varied 
between 1 and 2 Å and the box size between 100 and 150 Å, approximately
three times the diameter of the molecule. Three types of
atomic models were used: a MODELLER model consisting of all 
heavy atoms, but no hydrogens; a MODELLER model with hydrogen 
atoms added by CHARMM (60); and an average MODELLER model 
obtained by averaging the positions of side chain heavy atoms. The 
basis for using the average structure to calculate the average electro- 
static potential, as opposed to averaging the potentials from individ- 
ual molecules, is provided by the empirical observation that the 
electrostatic potential of an average structure from a dynamics tra- 
jectory is close to the average of potentials from individual coordinate 
sets of that trajectory (61). The side chain conformations were cal- 
culated by generating all possible side chain rotamers and then 
averaging them with the weights from the side chain probability 
density functions used to derive the model (61).

Fig. 2. Clustering of serine proteases. The scale on the y axis
is the percentage sequence identity. The tree shows that mMCP-1
and mMCP-4 are most similar to rMCP-II. mMCP-2 and mMCP-5
also belong to the same group of proteins, but they are less similar to
rMCP-II than mMCP-1 and mMCP-4. The mouse tryptases, mMCP-6
and mMCP-7, form a small subgroup which clusters in a separate
group with all the remaining proteases, including trypsin and
chymotrypsin. An exception is the bacterial trypsin from Streptomyces griseus,
which is an outlier in the family that otherwise contains only
mammalian serine proteases.

Because the exocytosed serosal mast cell chymases remain bound to heparin
proteoglycan in an extracellular environment where the pH is approximately
7, the net charge of -1 was assigned to each Asp and Glu residue and 
a net charge of +1 to each Lys and Arg residue. The His residues,
which usually have a pKa of ≈6.5, were considered neutral as were all
other amino acid residues. The partial atomic charges were taken
from the CHARMM 22 force field.² When hydrogen atoms were
omitted from the protein model, the charges of the remaining heavy 
atoms were corrected by adding the charges of hydrogen atoms 
covalently bound to them. In the analysis that follows, we consider 
only qualitative features of the electrostatic potential. These were 
preserved in the different models that were compared. 
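
The formal charge assignment described above can be sketched in a few lines of Python. This is only an illustration, not the procedure run with UHBD, which used CHARMM partial atomic charges: the fragment sums formal side chain charges for a one-letter-code sequence (chain termini are ignored, which is an assumption) and uses the Henderson-Hasselbalch relation to show why a His side chain with a pKa near 6.5 can reasonably be treated as neutral at pH 7. The function names and the toy peptide are hypothetical.

```python
# Formal side chain charges at pH 7 as described in the text:
# Asp and Glu -1, Lys and Arg +1, His and all other residues neutral.
FORMAL_CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}


def net_charge(sequence: str) -> int:
    """Sum the formal side chain charges of a one-letter-code sequence."""
    return sum(FORMAL_CHARGE.get(residue, 0) for residue in sequence.upper())


def protonated_fraction(ph: float, pka: float) -> float:
    """Henderson-Hasselbalch fraction of a basic group that is protonated."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    toy_peptide = "IIGGKRDEHA"  # hypothetical sequence, not an mMCP
    print(net_charge(toy_peptide))                  # two basic, two acidic
    print(round(protonated_fraction(7.0, 6.5), 2))  # His at pH 7
```

With a pKa of 6.5, only about a quarter of His side chains carry a proton at pH 7, which is consistent with the choice to leave them uncharged in a qualitative analysis.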

Location of Potential Protease-specific Antigenic Sites in Mouse 
Mast Cell Chymases— To find continuous epitopes in the models that 
would elicit antipeptide immunoglobulins specific for an individual 
mMCP, four features of protein structure and sequence were opti- 
mized (62-64) in addition to the factors that influence cellular uptake, 
processing, and presentation of the antigen. First, such a stretch of 
residues should be on the surface of the folded protein (65). The PSA 
computer program³ that applies the method of Richmond and Richards
(66) was used to calculate solvent-accessible areas for each
residue. These areas were normalized to a range between 0 and 100% 
as described (53) and then used to highlight the exposed segments of 
a molecule. Second, a good epitope should not only be on the surface 
but also protrude out of it (62). A protrusion index, calculated by the 
program ELLIPSOID, was used to quantitate how far out of the 
surface a segment of residues protrudes. Third, antipeptide immu- 
noglobulins bind more strongly to mobile parts of the chain than to 
more rigid parts (67, 68). Thus, the average main chain (including N, 
Cα, C, and O atoms) isotropic temperature factors of rMCP-II were
examined to identify the more mobile segments in each mMCP. 
Because rMCP-II molecules A and B in the crystallographic unit cell 
(49) have virtually the same mobility, only molecule A was used. The 
three short segments of the main chain whose conformation could
not be determined (residues 83-85, 155-157, and 224),⁴
presumably due to high mobility, were assigned isotropic temperature
factors of 40 Å².

² A. D. MacKerell, Jr., D. Bashford, M. Bellott, R. L. Dunbrack,
Jr., M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph, L.
Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos, S. Michnick, T. Ngo,
D. T. Nguyen, B. Prodhom, B. Roux, M. Schlenkrich, J. Smith, R.
Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, and M.
Karplus, manuscript in preparation.

³ The computer programs PSA and ELLIPSOID were written by A.
Sali and are available upon request. The MODELLER computer program
was written by A. Sali and T. Blundell, manuscript in preparation.
For more information about MODELLER, contact A. Sali.

⁴ Two numbering systems are used to indicate the position of an
amino acid residue in a sequence. The first number corresponds to
the position of the residue in the mature protease. The subsequent
subscript number in parentheses refers to the equivalent residue in
chymotrypsin (Fig. 1).

Additionally, no correction for intermolecular contacts
in the crystal had to be made, since the two loops around
residues 28(40) and 136 that form intermolecular contacts have
high temperature factors. Fourth, to obtain an antibody capable of
distinguishing between individual mMCPs, the immunogen must 
have an amino acid sequence that is specific to the protein of interest. 
Moreover, a high mutation rate resulting in specific sequences is also 
correlated with antigenicity (63). The average difference for all pairs 
of residues at each position in the alignment of mMCPs was plotted 
to highlight these variable segments. The difference between two 
residue types (53) is defined as being proportional to the difference 
in residue size, hydrophobicity, refractivity index, and secondary 
structure propensities. These properties were used because they are 
statistically the most conserved combination of residue type features 
in evolution (69) and thus the most reliable indicator of the differ- 
ences among the protein structures. Before plotting accessibility, 
protrusion index, mobility, and variability for each residue position 
in a serine protease, the values were smoothed by the running average 
method; a value at position i is an average of values from positions i 
- 2 to i + 2. The four features are highly correlated and have about 
the same overall prediction success, but they do not always give the 
same prediction (62, 63). Thus, it is useful to employ them in com- 
bination and to check if there are significant deviations in the results 
for any potential epitope. 
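
The running average used to smooth the four per-residue profiles can be sketched as follows. The window from i - 2 to i + 2 comes from the text; how the first and last two positions of the chain were treated is not stated, so the truncated window used here is an assumption, and the function name is hypothetical.

```python
def running_average(values, half_window=2):
    """Smooth per-residue values with a centered window of 2*half_window + 1.

    Near the chain termini the window is truncated to the positions that
    exist. This edge handling is an assumption, not taken from the paper.
    """
    smoothed = []
    n = len(values)
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    # A single sharp peak is spread over its neighbours by the smoothing.
    print(running_average([0, 0, 10, 0, 0]))
```

The same function would be applied separately to accessibility, protrusion index, temperature factor, and variability before the curves are compared.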

RESULTS 

Comparative Modeling of Mouse Mast Cell Chymases— The 
amino acid sequence alignment of mMCP-1, mMCP-2,
mMCP-4, mMCP-5, and rMCP-II with pancreatic chymotrypsin
is shown in Fig. 1. The grouping of the amino acid
sequences of all mMCPs and nine serine proteases of known
3D structure is shown in Fig. 2. mMCP-1, mMCP-2, mMCP-4,
and mMCP-5 are more similar to rMCP-II than to any
other serine protease with known 3D structure. No insertions
or deletions are needed to align the amino acid sequences of
these mast cell chymases with rMCP-II, except for a single
2-residue deletion in mMCP-2 at positions 201-202(220A-221) (Fig.
1). Consequently, the structure of rMCP-II determined by x- 
ray crystallography (49) was chosen as the template protein 
for comparative modeling of the four mouse mast cell chy- 
mases. The amino acid sequences of mMCP-6 and mMCP-7 
were significantly different from other mMCPs and from 
rMCP-II (Fig. 2). Although the models of the 3D structure of 
mMCP-6 and mMCP-7 were not constructed in this investi- 
gation, these two tryptases were recently modeled by Johnson 
and Barton (24). mMCP-3 was not modeled because only the 
NH2-terminal amino acid sequence of this serosal mast cell
protease has been determined. 
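
The clustering in Fig. 2 rests on pairwise percentage sequence identity. A minimal sketch of that quantity for two rows of an existing alignment is given below. The text does not say how gaps enter the normalization, so counting only columns in which both sequences have a residue is an assumption, as are the function name and the toy sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over columns without a gap.

    Both inputs must be rows of the same alignment (equal length).
    Normalizing by the gap-free columns is an assumption of this sketch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not taken from Fig. 1.
    print(round(percent_identity("IIGG-VE", "IVGGKVE"), 1))
```

A matrix of such values for all protein pairs is the kind of input a hierarchical clustering program would need to produce a tree like the one in Fig. 2.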

When the 3D models of mMCP-1, mMCP-2, mMCP-4, and
mMCP-5 were calculated, their backbones were found to be
virtually indistinguishable from the backbone of rMCP-II
(Fig. 3). The root mean square differences for superposition
of the Cα atoms are generally less than 0.2 Å, which is similar
to the root mean square deviation between the independently 
determined rMCP-II molecules A and B (49). The only large 
main chain differences among the mMCPs occur for the last 
2 residues at the COOH terminus (Fig. 3). These differences 
are probably a consequence of the lack of conformational 
constraints used in modeling which resulted from the absence 
of equivalent residues in rMCP-II (Fig. 1). The only signifi- 
cant differences among the four models and rMCP-II are the 
types and orientations of the side chains. The fact that 
violations of the input constraints are also small (see Table 
II for the constraints used to model mMCP-5) indicates that 
the amino acid sequences of the four mouse mast cell chy- 
mases are consistent with the 3D structure of rMCP-II. 
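
The root mean square difference quoted above can be illustrated with a short function. The sketch assumes that the two sets of Cα coordinates are already superposed and listed in the same residue order; finding the optimal superposition itself (for example with the Kabsch algorithm) is not shown. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long coordinate lists.

    Each list holds (x, y, z) tuples in angstroms. The structures are
    assumed to be superposed already; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
    template = [(0.0, 0.0, 0.2), (3.8, 0.0, -0.2)]
    print(round(rmsd(model, template), 3))
```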

A strong correlation between the sequence similarity of two 
proteins and the root mean square deviation of their backbone 
structures has been described (70, 71).

Fig. 3. A stereo Cα plot comparing the 3D models of mMCP-1, mMCP-2, mMCP-4, and mMCP-5 with the 3D structure of
rMCP-II. The Cα atoms of the four mMCP models are superposed on the crystallographic structure of rMCP-II. The arrow indicates the
largest difference in the main chain conformation of rMCP-II and the mMCP models. Chymotrypsin numbering is used. All black and white
plots of protein structures were created by the program MOLSCRIPT (97).

Fig. 4. Identification of sequence segments that can be used to obtain protease-specific antipeptide immunoglobulins. Depicted
are accessibility (the smoothed fractional side chain accessibility of the mMCP-5 model, measured in percentage points from 0 to 100),
protrusion (the smoothed protrusion index for the mMCP-5 model, measured in relative units from 1 to 10), flexibility (the smoothed main
chain isotropic temperature factor of rMCP-II, measured in Å²), and variability (the smoothed variability of mMCP-1, mMCP-2, mMCP-4,
and mMCP-5, in relative units from 0 to 1). The position of peptide 146-162 in mMCP-5 (27) is indicated at the top of the figure. The thin
horizontal lines (at arbitrary height) help to locate minima and maxima in the four curves. Antigenic segments are identified as the regions
where at least three of the four features have a pronounced optimum; the boundaries of antigenic segments are approximate, but they are all
between 8 and 15 residues long, which is an optimal size for an antigenic peptide (68). The nine predicted epitope segments are indicated at
the bottom of the figure. Epitope 1 is a protruding tip of a β-hairpin; epitopes 2, 4, and 5 are extended segments; epitopes 3 and 9 are
protruding reverse turns; epitope 6 is a β-hairpin; epitope 7 is a protruding tip of the α-turn-β motif; and epitope 8 is a reverse turn. The
residue numbers correspond to those in rMCP-II, mMCP-1, mMCP-4, and mMCP-5, rather than to those in chymotrypsin.

Fig. 5. Top view of the two positive regions in mMCP-4.
All heavy atoms are shown as spheres. The Lys and Arg residues are
colored in blue, the Asp and Glu residues are red, and the rest of the
residues are green. The two orientations of the molecule are obtained
from that in Fig. 3 by a rotation of ±90° around a horizontal axis. All
color plots of protein structures were created by QUANTA (Molecular
Simulations Inc., Waltham, MA). A, region 1; B, region 2. Both
regions are seen as blue strips running approximately horizontally
across the center of the plot.

Because the error in the model is usually similar to the difference between the
template protein and the actual structure of the unknown 
protein, this relationship indicates that the error in the models 
of the four mouse mast cell chymases is approximately 0.6 Å
for the buried and 1.3 Å for the exposed backbone atoms.
Approximately 80, 80, 70, and 75% of the side chains are
expected to have dihedral angles χ1, χ2, χ3, and χ4, respectively,
in the correct optima.⁵ Thus, there are some uncertainties
in the positions of positive charges at the end of long Lys
and Arg side chains on the protein surface. However, these
have a small effect on the global features of the electrostatic
potential considered below.

⁵ A. Sali and T. L. Blundell, unpublished findings.

Location of Potential Protease-specific Antigenic Sites in
Mouse Mast Cell Chymases— To maximize the likelihood that
a linear peptide will elicit an antibody that recognizes one
and only one protease, the stretch of residues in the native
protein should have maximal solvent exposure, protrusion out
of the surface, conformational mobility, and residue type variability
among the mouse mast cell chymases. A plot of these
features for mMCP-5 is shown in Fig. 4. If accessibility and
protrusion of mMCP-5 are replaced by those of rMCP-II,
mMCP-1, mMCP-2, or mMCP-4, essentially the same results
are obtained (data not shown). The nine segments that are
predicted to be the best for obtaining protease-specific antipeptide
immunoglobulins are shown in Figs. 1 and 4. The two
most favored segments, according to the four criteria used,
correspond to residues 74-89(87-102) (segment 3) and 196-207(212-224)
(segment 9). Both segments are protruding reverse
turns.
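
The selection rule stated in the legend of Fig. 4, that a segment qualifies when at least three of the four features have a pronounced optimum, can be approximated in code. The sketch below replaces "pronounced optimum" with a simple per-feature cutoff, which is a simplification; the cutoffs, the toy profiles, and the function names are all hypothetical and were not taken from the paper.

```python
def candidate_positions(features, thresholds, min_votes=3):
    """Return residue indices where at least min_votes features pass their cutoff.

    features:   dict mapping a feature name to a list of smoothed values
    thresholds: dict mapping the same names to hypothetical cutoffs
    """
    names = list(features)
    length = len(features[names[0]])
    if any(len(features[name]) != length for name in names):
        raise ValueError("all feature profiles must have the same length")
    hits = []
    for i in range(length):
        votes = sum(1 for name in names if features[name][i] >= thresholds[name])
        if votes >= min_votes:
            hits.append(i)
    return hits


def group_segments(positions):
    """Merge consecutive indices into inclusive (start, end) segments."""
    segments = []
    for pos in positions:
        if segments and pos == segments[-1][1] + 1:
            segments[-1] = (segments[-1][0], pos)
        else:
            segments.append((pos, pos))
    return segments


if __name__ == "__main__":
    profiles = {
        "accessibility": [10, 60, 70, 80, 20, 10],
        "protrusion": [1, 6, 7, 8, 2, 1],
        "flexibility": [10, 15, 30, 35, 12, 10],
        "variability": [0.1, 0.2, 0.6, 0.7, 0.8, 0.1],
    }
    cutoffs = {"accessibility": 50, "protrusion": 5,
               "flexibility": 25, "variability": 0.5}
    print(group_segments(candidate_positions(profiles, cutoffs)))
```

In practice the segments found this way would still need the length check mentioned in the figure legend (8 to 15 residues) and a visual inspection of the curves.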

Location of Proteoglycan Binding Regions in mMCP-4 and 
mMCP-5— Based on their 3D models, mMCP-4 and mMCP- 
5 have two distinct regions on their surfaces that contain 
considerably more positive than negative charges (Figs. 1, 5, 
6, and Table III). These regions are located at opposite ends
of the molecule. The regions are in separate domains and are
equidistant from the active site at the interface between the
two domains of the protein. The two positively charged regions
are convex strips approximately 20 Å long and 10 Å
wide. The chain segments comprising these regions are not
contiguous. Region 1 consists of Arg-12(27), two antiparallel
strands (120-125, 145-149(159-163)), and tips of two loops
(169-175, 200-205). Region 2 consists of a turn
(35-39(47-51)), two antiparallel strands (69-75, 93-100(106-113)),
and two turns of the COOH-terminal α-helix (220-226(239-245)).
In mMCP-4, each of these segments has 2 positively
charged residues, except for segment 93-100(106-113),
which has 4 of them. Both regions exploit the amphiphilic
periodicity of surface β-strands and extended segments to
have every other residue in a sequence charged and on the
surface. For example, there is a run of four positively charged
residues at 94(107), 96(109), 98(111), and 100(113) in region 2 of
mMCP-5. Assembling the positive regions from secondary
structure elements, which are rich in main chain-main chain 
hydrogen bonds, contributes to the stability of these regions 
with a large number of positive charges close to each other. 
The destabilizing effect of repulsion between like charges is 
also reduced by the strong screening effect of water, which 
diminishes charge-charge interactions on protein surfaces (72, 
73). 

Region 1 has a net charge of +10 in mMCP-4 and +9 in 
mMCP-5 (Table III); region 2 has a smaller but still large net 
charge of +8 in mMCP-4 and +6 in mMCP-5 (Table III).
Region 2 has a smaller net charge than region 1, because there
are more negatively charged residues in it, not because of a
smaller number of positively charged residues. Other serine
proteases listed in Table I have a smaller number of positive
charges in both region 1 and region 2 (Table III). Whether or
not these regions bind heparin is likely to be determined by 
the overall electrostatic potential which is a sum of the 
contributions from positive and negative charges. 
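
The net charges quoted for regions 1 and 2 follow directly from the counts of basic and acidic residues in Table III. The short sketch below reproduces that bookkeeping for four of the proteases; the counts are those listed in Table III, while the class and variable names are hypothetical.

```python
from typing import NamedTuple


class ChargeCount(NamedTuple):
    """Number of basic (Lys, Arg) and acidic (Asp, Glu) residues in a region."""
    basic: int
    acidic: int

    @property
    def net(self) -> int:
        return self.basic - self.acidic


# Counts for regions 1 and 2 as listed in Table III.
REGIONS = {
    "mMCP-4": {"region 1": ChargeCount(11, 1), "region 2": ChargeCount(10, 2)},
    "mMCP-5": {"region 1": ChargeCount(9, 0), "region 2": ChargeCount(10, 4)},
    "mMCP-1": {"region 1": ChargeCount(6, 2), "region 2": ChargeCount(8, 4)},
    "rMCP-II": {"region 1": ChargeCount(4, 2), "region 2": ChargeCount(7, 3)},
}

if __name__ == "__main__":
    for protease, regions in REGIONS.items():
        summary = ", ".join(f"{name}: {count.net:+d}"
                            for name, count in regions.items())
        print(f"{protease}: {summary}")
```

The example makes the point of the paragraph explicit: mMCP-4 and mMCP-5 have the same number of basic residues in region 2, and the difference in net charge comes from the acidic residues.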

Electrostatic Potential of Mouse Mast Cell Chymases— A 
stereoplot of electrostatic potential contours around mMCP- 
4 is shown in Fig. 7A. The contour maps of the electrostatic
potential for mMCP-1, mMCP-2, mMCP-4, mMCP-5, rMCP-II,
and chymotrypsin are compared in Fig. 8.

Fig. 6. 3D structures of chymotrypsin, rMCP-II, and models of the mouse mast cell chymases. Charge distribution is compared.
The two areas where positive charges predominate, regions 1 and 2, and the putative active site are identified in mMCP-4. Secondary
structure definitions were calculated by the program SSTRUC written by D. Smith. Arg and Lys side chains are drawn in full and labeled.
Asp and Glu residues are represented by their enlarged Cα atoms colored in gray. The active site residues are also drawn. The two domains
correspond to the upper and lower halves of the molecule. Residue numbers correspond to those in chymotrypsin. The orientation of the
molecules is the same as in Fig. 3.

For each molecule, the main features of the electrostatic potential do not
depend on the exact atomic model and parameters used in the 
calculation. For mMCP-4 and mMCP-5, almost all of the 
envelope about 10 Å from the surface has a positive potential.
However, mMCP-4 and mMCP-5 have two large bulges in
their contour maps that are due to the large number of
positively charged residues in regions 1 and 2. The potential
remains as high as 0.3 kcal/electron mol at an ionic strength
of 100 mM even 20 Å away from the protein surface. Both the
location and shape of regions 1 and 2 are the same in mMCP-4
and mMCP-5. The pronounced positive electrostatic potential
above regions 1 and 2 is eliminated if 3 residues in the
center of region 1 and 2 residues in the center of region 2 are
mutated to glutamic acid residues (Fig. 7B).
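
To put the distances and potentials quoted above in context, the Debye screening length of the solvent can be estimated from the ionic strength. The sketch below is a back-of-the-envelope calculation, not a substitute for the finite difference Poisson-Boltzmann calculation: it assumes a bulk relative permittivity of 78 for water (the value used in the study is not stated in this section), and the function names are hypothetical. It also converts a potential from kcal/(mol e) to units of kT/e at 300 K.

```python
import math

EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m
K_B = 1.380649e-23             # Boltzmann constant, J/K
N_A = 6.02214076e23            # Avogadro constant, 1/mol
E_CHARGE = 1.602176634e-19     # elementary charge, C
R_KCAL = 1.987204e-3           # gas constant, kcal/(mol K)


def debye_length_angstrom(ionic_strength_molar, temperature=300.0,
                          solvent_permittivity=78.0):
    """Debye screening length in angstroms for a given ionic strength (mol/L).

    solvent_permittivity=78.0 is an assumed value for bulk water.
    """
    ionic_strength_si = ionic_strength_molar * 1000.0  # mol/L -> mol/m^3
    numerator = EPSILON_0 * solvent_permittivity * K_B * temperature
    denominator = 2.0 * N_A * E_CHARGE ** 2 * ionic_strength_si
    return math.sqrt(numerator / denominator) * 1e10


def kcal_per_mol_e_to_kt(potential, temperature=300.0):
    """Convert a potential from kcal/(mol e) to units of kT/e."""
    return potential / (R_KCAL * temperature)


if __name__ == "__main__":
    for ionic_strength in (0.001, 0.100):
        length = debye_length_angstrom(ionic_strength)
        print(f"I = {ionic_strength * 1000:.0f} mM: {length:.1f} angstroms")
    print(f"0.3 kcal/(mol e) = {kcal_per_mol_e_to_kt(0.3):.2f} kT/e")
```

Under these assumptions the screening length at 100 mM is a little under 10 Å, so a distance of 20 Å corresponds to roughly two screening lengths, and 0.3 kcal/(mol e) is about half of kT/e at 300 K.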

Region 1 in mMCP-1, mMCP-2, and rMCP-II does not 
have a pronounced positive electrostatic potential, in contrast 
to region 1 of mMCP-4 and mMCP-5 (Fig. 8). This is due to 
a smaller number of positive residues (Table III) and to an 
almost even spatial distribution of the positive and negative 
residues on the surface of these proteases (Fig. 6). In contrast,
region 2 is present in all four mast cell chymases and in
rMCP-II (Fig. 8). However, whereas the location of the posi- 
tive field over region 2 in mMCP-1, mMCP-2, and rMCP-II 
is almost identical, this potential covers a smaller area than 
in mMCP-4 and mMCP-5. This difference is due to a smaller 
number of positive residues and to a different spatial distribution
of positive and negative residues. For example, 2
positive residues (96(109), 100(113)) and 1 neutral residue (97(110))
that are on the periphery of region 2 in mMCP-4 are negative
residues in mMCP-1. As a result, the area covered by a strong 
positive potential above region 2 is smaller in mMCP-1. 

Although regions 1 and 2 are on the opposite sides of the 
molecule, a strip of weak positive potential covers the surface 
away from the active site, and connects the two regions (Figs.
7A and 8). As with regions 1 and 2, this connecting strip is
more pronounced in mMCP-4 and mMCP-5 (Fig. 8) but is
also present in mMCP-1, mMCP-2, and rMCP-II. In mMCP-4,
the connecting strip includes residues 128, 112, and 211(230).

Table III
Basic and acidic residues in each serine protease (the total number
and those in regions 1 and 2)
Basic (+) residues at pH 7 consist of Lys and Arg; acidic (-)
residues consist of Asp and Glu; His is not included. The 15 serine
proteases are arranged in increasing order of their overall net charge.

                          Whole molecule      Region 1         Region 2
Serine protease             +    -   Net     +    -   Net     +    -   Net
Porcine kallikrein         13   30   -17     2    2     0     3    4    -1
mMCP-7                     15   25   -10     1    2    -1     2    0    +2
Rat trypsin                13   19    -6     1    4    -3     3    2    +1
mMCP-6                     18   23    -5     4    2    +2     2    3    -1
Rat tonin                  21   24    -3     3    3     0     6    4    +2
S. griseus trypsin         16   16     0     1    2    -1     5    1    +4
Bovine chymotrypsin        17   14    +3     0    0     0     4    1    +3
mMCP-1                     28   25    +3     6    2    +4     8    4    +4
rMCP-II                    25   21    +4     4    2    +2     7    3    +4
Porcine elastase           15   11    +4     4    1    +3     3    0    +3
Bovine trypsin             16   10    +6     4    2    +2     4    0    +4
mMCP-2                     27   20    +7     5    2    +3     8    2    +6
Human neutrophil elastase  19    9   +10     3    0    +3     1    1     0
mMCP-5                     29   17   +12     9    0    +9    10    4    +6
mMCP-4                     34   19   +15    11    1   +10    10    2    +8

None of the other serine proteases listed in Table I has a
positive potential in regions 1 and 2 as pronounced as that of
the four mouse mast cell chymases and rMCP-II.

The four mouse chymases and rMCP-II have a region with 
negative electrostatic potential located close to the active site 
(Figs. 7A and 8). However, this region does not overlap with
any of the substrate binding sites previously identified in
complexes between serine proteases and their protein inhibitors
(74). In mMCP-5, this negative region includes residues
6(21), 64(77), 65(78), 135(149), 139(163), and 143(157).

DISCUSSION 

Mammalian serine proteases are enzymes consisting of two 
domains and approximately 230 residues, with the active site 
located in the cleft at the interface between the two domains. 
Each of the two domains consists of a distorted six-stranded
β-barrel with a buried structurally conserved core and of one
or two helices (75). The main structural differences between
the members of this family are in the length and conformation
of the exposed loop segments connecting the conserved
strands and helices. Except for the most similar pair, the
sequence identities of protein pairs range from 26 to 42%.
Generally, about 85% of the Cα atoms can be superimposed
with a root mean square deviation from 0.8 to 1.1 Å.

In this study, comparative molecular modeling was used to 
determine the 3D structure of mMCP-1, mMCP-2, mMCP-4,
and mMCP-5. Because these four mMCPs were found to be 
more similar to rMCP-II than to any other serine protease 
with known 3D structure (Figs. 1 and 2), their 3D models 
were all based on the 3D structure of rMCP-II determined by 
x-ray crystallography (49). As depicted in Fig. 3, the back- 
bones of these four models are virtually indistinguishable 
from the backbone of rMCP-II; the root mean square deviation
is less than 0.2 Å. As in rMCP-II (49), the amino acid
insertions, deletions, and the loss of a disulfide bond caused
conformational changes in the mouse mast cell chymases
relative to chymotrypsin which created new binding sites for
interaction with the P3 and P1′ residues of a substrate, thereby
restricting their specificities compared with that of chymotrypsin.
In particular, the mast cell chymases are expected to
have a preference for a large hydrophobic residue at P3 and
for a hydrophobic residue at P1′, in addition to the chymotryptic
specificity for a large hydrophobic residue at P1 (49).



Unlike the pancreatic chymotrypsin gene which resides on 
chromosome 8, the mast cell chymase genes are all clustered 
on chromosome 14.⁶ Apparently, this family of genes developed
when a primordial gene encoding a protease with more
restricted specificity than chymotrypsin underwent duplica- 
tion and divergence. 

From the 3D models of mMCP-1, mMCP-2, mMCP-4, and
mMCP-5, nine peptide segments were identified that could
be tested to obtain protease-specific antipeptide immunoglobulins.
The degrees of accessibility, protrusion, flexibility, and
variability of the peptide sequences were the parameters used
for identification. The two most favored regions generally
correspond to residues 74-89(87-102) (segment 3) and 196-207(212-224)
(segment 9). However, segment 1 in mMCP-2 and
segment 3 in mMCP-1 contain N-glycosylation sites and
therefore may not be good epitopes for those two proteases.
McNeil et al. (27) obtained an antibody against mMCP-5
using as an immunogen a synthetic peptide that corresponds
to residues 146-162(160-177) (segment 7) of the mature protease.
Additionally, segment 2 was used to prepare an antibody that 
reacted with mMCP-2 but not with any of the other known 
mMCPs (76). 

Many proteins with exposed basic residues bind to acidic 
proteoglycans (77). Although several 3D structures of glyco- 
saminoglycans (78) and of their protein ligands are available, 
no 3D structure of a proteoglycan-protein complex has been 
determined experimentally. Nevertheless, structural aspects 
of this interaction have been studied theoretically and indi- 
rectly by experiment (39, 79-83). It appears in some cases 
(e.g. in the antithrombin-heparin complex) that the binding involves
specific interactions (80), whereas in others (e.g. in the
complex of heparin cofactor II with heparin or dermatan 
sulfate) more general charge effects are involved (79). 

An aim of this study was to use the 3D models of the four 
mast cell chymases to identify positively charged regions on 
their surfaces that could enable them to interact with nega- 
tively charged serglycin proteoglycans. There are two such 
potential heparin binding regions in mMCP-4 and mMCP-5. 
Their location away from the active site explains why mMCP-4
and mMCP-5 are still active when in complex with proteoglycans.
Additionally, the active site region has a net negative
charge that is likely to repel the negatively charged heparin
glycosaminoglycan. In contrast to mMCP-4 and mMCP-5,
mMCP-1 and mMCP-2 have a weaker region 2 and only a
few positively charged residues in region 1. Of the rat chymases,
rMCP-I has both positive regions (49) and therefore
resembles mMCP-4 and mMCP-5; rMCP-II is similar to
mMCP-1 and mMCP-2. The two positive regions in these
mast cell chymases are different from the single proposed 
heparin binding region in thrombin (83) as well as from that 
in mast cell tryptases (24). Therefore, the absence of a larger 
number of basic residues in regions 1 and 2 in the serine 
proteases listed in Table III does not necessarily imply that 
they do not bind proteoglycans, but only that proteoglycans 
probably do not bind to regions 1 and 2. 

The importance of the large positive electrostatic potential 
in regions 1 and 2 for binding to glycosaminoglycans can be 
tested by site-directed mutagenesis. The relatively large dif- 
ferences in the total charge and distribution of positive resi- 
dues among the chymases that have a pronounced positive 
potential (Table III, Fig. 1) indicate that a single point mu- 
tation is not likely to remove the binding capacity of regions 
1 and 2, unless specific ion pair interactions are involved.

⁶ Gurish, M. F., Nadeau, J. H., Johnson, K. R., McNeil, H. P.,
Grattan, K. M., Austen, K. F., and Stevens, R. L. (1993) J. Biol.
Chem. 268, in press.

Fig. 7. Stereoplot of the electrostatic potential around native and altered mMCP-4. The Lys and Arg side chains are colored in
blue, the Asp and Glu side chains are red, and the rest of the protein is green. The orientation of the molecule is similar to that in Fig. 3. The
electrostatic potential is contoured at 0.6 (light blue) and -0.6 kcal/electron mol (light red). A, native protein. B, mMCP-4 with diminished
electrostatic potential. The latter model is obtained by replacing the positively charged amino acid residues 145(159), 147(161), and 173 (region
1), and 37(49) and 97(110) (region 2) with glutamic acid residues.

Fig. 8. Comparison of electrostatic potentials of serine proteases.
Contour plots for electrostatic potential
are compared for the molecules plotted
in Fig. 6. The molecules are in the same
orientation as in Fig. 3. The electrostatic
potential is contoured at levels -0.9 and
-0.3 (dashed lines) and 0.3 and 0.9 kcal/
electron mol (continuous lines).
Panels: mMCP-1, mMCP-2, mMCP-4, mMCP-5.

A model of mMCP-4 in which the two strongly positive regions
disappeared was constructed by changing 3 positively charged 
residues in the center of region 1 and 2 residues in region 2 
to glutamic acids (Fig. 7B). In this altered molecule, the net
charge of both regions is decreased to the value typical for the 
proteases that do not bind heparin (Table III). 
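
The effect of the proposed substitutions on the net charge of each region can be checked with simple arithmetic: every Lys or Arg replaced by Glu removes one positive charge and adds one negative charge, so the net charge drops by 2 per substitution. The sketch below applies this to the Table III counts for mMCP-4, assuming that all substituted residues are among the basic residues counted for the region; the function name is hypothetical.

```python
def net_after_charge_reversal(basic: int, acidic: int, n_reversed: int) -> int:
    """Net charge of a region after n_reversed Lys/Arg -> Glu substitutions."""
    if not 0 <= n_reversed <= basic:
        raise ValueError("cannot reverse more basic residues than are present")
    return (basic - n_reversed) - (acidic + n_reversed)


if __name__ == "__main__":
    # mMCP-4 counts from Table III: region 1 has 11 basic and 1 acidic
    # residues, region 2 has 10 basic and 2 acidic residues.
    print(net_after_charge_reversal(11, 1, 3))  # region 1, three substitutions
    print(net_after_charge_reversal(10, 2, 2))  # region 2, two substitutions
```

Under this assumption both regions end up with a net charge of +4, the same value that Table III lists for regions 1 and 2 of mMCP-1.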

Even though mMCP-4 has an amino acid sequence most 
similar to that of mMCP-1 (Figs. 1 and 2), only the proteases 
packaged in serosal mast cell granules in complex with heparin
proteoglycans (mMCP-4, mMCP-5, and rMCP-I) have
both regions 1 and 2 with high positive charge density. In
contrast, proteases packaged in the chondroitin sulfate-rich
granules of mucosal mast cells (mMCP-1, mMCP-2, and 
rMCP-II) have a significant positive potential only around 
region 2 and even that is weaker than the corresponding 
potential in the proteases of serosal mast cells. This localiza- 
tion of chymases can be interpreted as follows. The minimal 
positive charge density on a protease that is required for the 
binding of heparin is larger than that required for the binding 
of chondroitin sulfate because of the higher negative charge 
density on heparin. Evolution without the selective pressure 
for regions with high charge density tends to remove such 
regions because these regions do not have random composi- 
tion. The proteases are therefore expected to have only as 
much positive charge as is required for binding their ligands. 
Consequently, it is possible that chymases with the lower 
concentration of positive charges (mMCP-1, mMCP-2, and 
rMCP-II) evolved to bind the weaker electrolyte (chondroitin 
sulfate), and chymases with the higher concentration of 
charges (mMCP-4, mMCP-5, and rMCP-I) evolved to bind 
the stronger electrolyte (heparin). Thus, the 3D models sug- 
gest a structural explanation for the selective localization of 
specific chymases within mouse and rat mast cells that con- 
tain different proteoglycans. 

For the tryptase-heparin interaction inside the acidic granule,
it has been suggested that histidine residues play an
important role (24). This assumption was used to explain
dissociation of tryptases from heparin when the pH is raised
from 5.5 to 7.0 during exocytosis and degranulation. The
absence of histidines in positive regions 1 and 2 of chymases 
is consistent with the fact that those proteases remain bound 
to heparin even after exocytosis. 




Fig. 9. Schematic model of the interaction of heparin with 
mMCP-4. A protease molecule interacting with two glycosaminogly- 
can chains is shown. It is also possible that the two regions are 
covered by one segment of a single glycosaminoglycan chain embrac- 
ing the whole top, left, and bottom part of the molecule; this alter- 
native would be facilitated by a strip of weak positive electrostatic 
potential connecting regions 1 and 2. A similar mode of binding to 
heparin is expected for mMCP-5 and rMCP-I.

The crystallographic analyses of several types of polysac- 
charide chains (78) show that they exist in left-handed helical 
conformation with four, six, eight, or sixteen monosaccharides 
per turn. In the case of chondroitin sulfate with eight mono- 
saccharides per turn (code 1C4S in the Brookhaven Protein 
Data Bank), the dimension of an arc formed by a part of one 
turn of the polysaccharide helix is about 20 Å, corresponding
to four to five monosaccharides. The size of the arc matches
the length of the specific antithrombin binding pentasaccharide
(81). Its size and curvature are also complementary to
the positively charged regions 1 and 2 in mMCP-4 and 
mMCP-5. Additionally, there is a charge complementarity 
between approximately seven negative charges on the heparin 
pentasaccharide and six to ten positive charges on the two 
regions in mMCP-4 and mMCP-5. The large number of 
charged residues involved is in accord with significant coop- 
erativity and high affinity binding (84, 85); the application of 
Manning's condensation model suggests that a minimum of 
five charges is required for this effect. 
A schematic model of the interaction of mMCP-4, mMCP-5,
and rMCP-I with heparin is shown in Fig. 9. The model for
interaction is not sufficiently detailed to distinguish between 
a specific electrostatic interaction that requires a certain 
oligosaccharide sequence and an interaction that relies on 
charge density without many steric restrictions. Nevertheless, 
the model can serve as the basis for informed site-directed 
mutagenesis experiments that should provide more informa- 
tion on the binding between mast cell chymases and heparin. 